ABSTRACT

As computers are used for increasingly complex operations such as
retrieving documents and analyzing sentences, it becomes apparent that
human decision-making is still an essential element of the process. The
use of the on-line interactive capability of today's third-generation
computers supported by typewriter and display scope terminals makes the
construction of computer-aided systems for these complex tasks an attractive
approach. Two such systems are described in this paper. One is BOLD, a
document retrieval system that offers the user an on-line browsing
capability as well as the ability to retrieve documents or construct
bibliographies using computer-driven display scopes and typewriters. The other
is a sentence-analysis system that computes dependency analyses, phrase
structure analyses and kernel sets for each sentence it is given. This
system produces and displays multiple analyses and allows the user to
correct them or to select those which are satisfactory.

Our conclusion is that for some time to come complex information
processing systems, particularly those concerned with natural languages,
will remain at the level of semiautomatic computer aids to human information
processing. As such, their usefulness can be maximized by optimal use of
interactive display technology.


I.  INTRODUCTION 

Using computers for information retrieval and for producing syntactic
analyses has been a frequent but often frustrating application of
computational linguistics technology. Document retrieval systems such as Salton's
SMART; the several formatted-data-base querying systems such as ADAM, LUCID,
and various classified military applications; and text retrieval approaches
such as Protosynthex have all shown a great potential for augmenting human
abilities to manage printed and tabular data in military information systems.
Underlying these query systems, there is often the still only partially
satisfied requirement that the input material and the queries to the system
must be syntactically analyzed, preferably automatically.

The  frustrating  aspect  of  these  applications  is  immediately  apparent 
to  every  user.  In  the  ordinary  retrieval  situation,  the  user  is  often  at 
a  loss  to  guess  appropriate  categories  under  which  an  indexer  or  librarian 
has  classified  material.  He  may  be  ignorant  of  the  format  and  restrictions 
on  the  query  language  that  the  information  system  requires.  If  automated 
syntactic  analysis  is  a  feature  (as  in  some  research  systems),  he  will 
certainly  be  frustrated  by  multiple  interpretations--often  incorrect--of 
the  sentences  or  queries. 

Today's resolution for both of these problems requires that the
information systems be interactive and heavily supported by rapid display systems.

In  the  retrieval  context,  the  display  system  allows  the  user  to  see  the 
thesaurus  that  guides  an  indexer's  choice  of  terms  for  classifying  and  indexing 
documents.  He  may  use  this  information  in  conjunction  with  his  knowledge  of 
what  he  is  looking  for  to  locate  a  set  of  relevant  documents.  By  use  of  a 
browsing mode (being able to scan sets of titles and abstracts indexed
together) with the aid of a display scope, the user can get almost as satisfying
use out of a magnetic tape library as he could by visiting the document or
fact collection. In querying tabular data he can use a computer-controlled
display scope to build graphs and charts semiautomatically. In those cases
where syntactic analysis is required, he can select from a display scope the
correct syntactic analysis and greatly reduce the errors made by current
inadequate parsing algorithms.

Two such interactive display systems recently developed at SDC are
described briefly in the following pages: BOLD, for bibliographic on-line
display, and PLP-II, a parsing component for a text retrieval system.


II.  INTERACTIVE  BROWSING  AND  DOCUMENT  RETRIEVAL 

The Bibliographic On-Line Display system, BOLD, was developed at System
Development Corporation by Harold Borko and Howard Burnaugh (Borko, 1965;
Burnaugh, 1966). The problem that was attacked in the design and programming
of BOLD was that of providing a user some of the same capabilities in using
a computerized retrieval system that he finds helpful in a library. For
example, in the ordinary library the card file offers titles, index terms,
author names, and subject classification headings as means for finding the
desired documents. Once in the stacks, a user can browse through books in
the near neighborhood of any one that the card file directed him to.
In a computerized retrieval system, on the other hand, the user is often
restricted to making queries in a formalized mode and, usually with little
control of the process, receives a batch of document numbers that may, or
may not, satisfy his needs. Our approach to easing this problem in the use
of automated systems has been to develop means for allowing the user to
obtain greater control of the retrieval process through the use of an on-line
time-shared computer supported by teletypes and display scopes. In this
system the user controls many aspects of the process of discovering what his
request means, in terms of numbers and kinds of documents that may be retrieved
in response to his query. With the aid of such feedback, the user may modify
his request until his intermediate displays allow him to be relatively certain
that he will receive an appropriate number and kind of documents as a response.

An  illustrated  example  of  a  user's  attempt  to  retrieve  documents  with 
the  BOLD  system  will  show  the  utility  of  some  of  the  interactive  features. 

As soon as the system is loaded, the display scope presents the major
classification headings used in the collection. A classification used in the ASTIA
Thesaurus is shown in Figure 1. The user may begin by investigating the
classification system in depth. To do this he uses a light pen on any part
of the line associated with a Division that he wishes to explore. Firing the
light pen at "Div. 4, Chemistry" in Figure 1 resulted in the display shown
in Figure 2. By light-penning "Chemical Analysis" in Figure 2, the display
of Figure 3 is obtained. In this fashion the user can explore the
classification system that the indexer used to classify documents in the collection.

The user has additional options. He may shift to the teletype to discover
descriptors; he may add hierarchical terms or synonyms to the classification
scheme or to the dictionary; or he may query to discover how many documents are
referenced by a given descriptor, author, publisher, etc. If he has a term in
mind such as the word "heat," he may ask the system to display similar terms
as follows:


HEAT?

The system responds:

The following may be similar to heat
heat
thermodynamics
enthalpy
heat exchangers
heat of formation
heat of fusion
heat of sublimation
heat production
heat resistant alloys
heat resistant polymers
heat transfer
heat treatment
heaters
heating
*end


Figure 1. Display of Category Titles

The additional terms will suggest alternate ways of phrasing his request
and, in another fashion, bring the user into closer coordination with the
unknown indexer. With this information the user may further explore the
system to discover how many documents are indexed under each term. For
example, he may make the queries separately using a colon as a delimiter to
indicate he wishes to use the term he input, and only that term, as a descriptor:

HEATERS: 

The system responds:

1 entries are referenced by heaters
*end

He may instead wish to use the term and all its variant forms, which he
does by omitting the colon delimiter:

HEAT

The system responds:

6 entries are referenced by heat
1 entries are referenced by heaters
2 entries are referenced by heating
*end

In  a  similar  fashion  he  may  investigate  properties  of  all  the  descriptors  under 
which  he  thinks  his  documents  might  be  found.  He  may  use  the  same  approach 
with  such  items  as  author  names,  document  titles,  corporation  authors,  and  any 
other  aspects  of  a  document  that  have  been  recorded. 
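
The two request forms just described (a trailing colon for one descriptor
only, no colon for the descriptor and its variant forms) can be pictured with
a minimal Python sketch. It is not BOLD's implementation: the function name
count_entries, the dictionary-of-sets index, and the rule that a "variant
form" is a single-word descriptor beginning with the requested term are all
assumptions made for illustration.

```python
def count_entries(request, index):
    """Return (count, descriptor) pairs for one teletype request.

    index maps each descriptor to the set of document numbers it references
    (a hypothetical layout, not BOLD's actual file structure).
    """
    text = request.strip().lower()
    if text.endswith(":"):
        # Colon delimiter: use this term, and only this term.
        term = text[:-1].strip()
        return [(len(index.get(term, ())), term)]
    # Assumption: a variant form is a single-word descriptor that starts
    # with the requested term (heat, heaters, heating).
    variants = sorted(t for t in index if " " not in t and t.startswith(text))
    return [(len(index[t]), t) for t in variants]


# Toy collection invented for the example.
index = {
    "heat": {1, 2, 3, 4, 5, 6},
    "heaters": {7},
    "heating": {2, 8},
    "heat transfer": {3, 9},
}

for count, term in count_entries("HEAT", index):
    print(f"{count} entries are referenced by {term}")
```

The multi-word descriptor "heat transfer" is left out of the variant list
here only because of the single-word assumption; the paper does not state how
BOLD decides which forms count as variants.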

When ready to request documents the user may shift the system to a search
mode by typing SEARCH. At this point he may enter his retrieval request in
the following syntax:

a and b and not c or d

where the letters are any descriptor terms. If no connectors such as "and" or "or"
are used, the system interprets the blank between descriptor terms as an "or".
The following query:

HEAT  EXCHANGERS  OR  BOUNDARY  LAYER  OR  HEAT  TRANSFER  OR  VAPORS  OR  THEORY 

results in the Search Mode display shown in Figure 4. This display shows the
user how many documents are referenced by combinations of the descriptor terms.
Using light-pen actions or teletype actions he may then obtain such information
as titles and/or authors for documents in the set, as shown in Figure 5, or he
may obtain abstracts, as shown in Figure 6. He may obtain hard copy
corresponding to any of these displays by a teletype request. The hard copy may be
printed on his teletype or printed later from a magnetic tape in an off-line
mode.


Figure 4. Search Mode Display
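
The request syntax described above (descriptor terms joined by "and", "or"
and "not", with a bare blank read as "or") can be illustrated with a short
Python sketch. Several things in it are assumptions rather than documented
BOLD behavior: connectors are applied strictly left to right, multi-word
descriptors are recognized by greedy longest match against the known
descriptors, and the names segment and search are hypothetical.

```python
CONNECTORS = {"and", "or", "not"}


def segment(words, vocab):
    """Split a run of words into known descriptors, longest match first."""
    terms, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):
            candidate = " ".join(words[i:j])
            if candidate in vocab:
                terms.append(candidate)
                i = j
                break
        else:
            terms.append(words[i])  # unknown word: treat it as its own term
            i += 1
    return terms


def search(query, index):
    """Evaluate a request left to right over sets of document numbers."""
    universe = set().union(*index.values())
    vocab = set(index)
    result, op, negate, run = set(), "or", False, []

    def apply_run():
        nonlocal result, op, negate
        for term in segment(run, vocab):
            docs = index.get(term, set())
            if negate:
                docs = universe - docs
            result = result & docs if op == "and" else result | docs
            op, negate = "or", False  # a blank after a term means "or"
        run.clear()

    for word in query.lower().split():
        if word in CONNECTORS:
            apply_run()
            if word == "not":
                negate = True
            else:
                op = word
        else:
            run.append(word)
    apply_run()
    return result


# Toy collection invented for the example.
index = {"heat": {1, 2, 3}, "heat transfer": {3, 4}, "vapors": {5}, "theory": {2, 5}}
print(sorted(search("HEAT TRANSFER OR VAPORS", index)))
```

Left-to-right evaluation is the simplest reading of the syntax line; a real
system might give "and" higher precedence than "or", which the paper does not
specify.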

The system works almost instantaneously, even in the time-shared mode with
fifteen or more other users of the computer. It is designed to deal with up to
100,000 documents but has so far been tested on samples of only 1000. Detailed
descriptions of the system and its programming features can be found in Burnaugh
(1966a, 1966b). So far, our efforts with BOLD have been devoted to developing
it as a complete interactive retrieval system engineered to enable a user to have
a nonfrustrating experience in using a computer-based document library. Research
in the future will be oriented to testing it in actual live situations where
users wish to retrieve documents, produce bibliographies or browse among the
documents in a collection. It is this research that will reveal how successful
the approach is and that will derive the information necessary to the further
human engineering of the system.


Figure  5.  Browse  Mode  Display  for  Requested  Authors  and  Titles 


Figure 6. Browse Mode Display for an Abstract Request


III. AN INTERACTIVE SENTENCE ANALYSER

The BOLD system is an example of fairly highly developed document retrieval
technology in which interactive displays are used to permit a man to protect
himself from the literal-mindedness of a programmed computer. The language
used in such systems is a simple one made up of descriptor terms and conjunctives
which are themselves English words or terms. The syntax of this language is
ultimately simple; it includes two or three delimiters and a rule for bounding
descriptor terms either by delimiters or conjunctives. However, in many
advanced research attempts to develop question-answering systems, the language
is some subset of a natural language such as English, French, or Russian. The
data, instead of being formatted document titles, authors, abstracts, etc.,
may be in the form of constrained or free-flowing text. As a consequence, the
syntax that such systems must deal with approaches the complexity of that found
in natural languages.

For this reason, an essential feature of a fact-retrieval or
question-answering system is a powerful statement analyzer. The systems that deal with
natural languages require a syntactic analysis procedure that can handle large
subsets of natural language constructions. The problem is that in ten years
of research history, although many automatic parsers have been developed, none
can be depended on to give all the correct analyses for a sentence, and even
the best of them (such as Zwicky, et al., 1965; Kuno, 1965; Clarke & Wall,
1965) analyze some or many sentences incorrectly.

Our own research in this area has led us through many attempts to construct
wholly automatic parsers that would produce a single correct analysis for each
of a large subset of English sentences. In each case we failed to achieve our
goal and finally concluded that, until many advances have been made in
computer approaches to deriving the meanings of words, no completely automatic
parser could be constructed. Our response to this finding was to build a
system that works together with a person to derive syntactic analyses for
English sentences. We called this system PLP-II, since it bore many
resemblances to the Pattern Learning Parser developed and described by McConlogue and
Simmons (1965).

PLP-II is programmed in LISP 1.5 and offers several unique and interesting
features. First, its input is in the form of already parsed sentences. From
the sentences it has experienced, the system derives vocabulary and grammar
rules that it applies to new sentences of similar structure. Our philosophy
here is that it is far easier to develop a consistent grammar by having a
computer system derive it from parsed sentences than to develop the grammar
by ourselves, making a linguistic analysis of a large corpus of English.
A second feature is that it produces and displays both a dependency
analysis and a labelled phrase-structure tree for each sentence that it can
parse. A third important feature is that it produces kernel sentences, one
for each deep-structure sentence string that may be presumed to underlie the
surface syntactic structure of the sentence (see Chomsky, 1965). A final and
essential feature of PLP-II is that it is on-line on a time-shared computer,
and users have the freedom to add and delete grammar and vocabulary, to correct
the analyses the system makes, and to select the parsings that are intuitively
correct for the user. It is our belief that for many years to come, only such
a machine-aided approach can be used to obtain correct analyses of text in a
computer system.

Sentences  are  input  to  the  system  in  the  following  fashion: 

#1  THE  OLD  MAN  SAT  ON    THE  BEACH  .
    ART  ADJ  N    V    PREP  ART  N      .
    N    N    V    *    *V    N    *PREP  .


This is in the form of three strings where the first is the list of English words
in the sentence, the second is the corresponding list of their parts of speech
or word-classes, and the third is the list that shows what word-class each word
in the sentence is dependent on. Readers familiar with dependency analysis will
observe in example #1 that the ART (article) is dependent on the following noun,
the N (noun) is dependent on the following verb, the V (verb) is the head of
the sentence as indicated by the asterisk and the PREP (preposition) is
dependent backwards on the verb as shown by its symbol, *V.

Using information from these three strings the system augments its
dictionary with additional vocabulary and word-class items and additional
grammar rules. The dictionary entry that is constructed for each word is in
the form of a set of 4-tuples, each of which shows word-classes for the
preceding word, for the word itself, and for the word that followed it. The
fourth piece of information is the word-class on which the word was dependent
when it was in the context shown by the preceding three terms of the 4-tuple.
For example, from sentence #1 above, the word "man" would develop a 4-tuple
as follows:


MAN:  ADJ-N-V,  V  . 

This  shows  that  the  word  has  been  a  noun  in  a  context  where  it  was  preceded 
by an adjective and followed by a verb. In this context it was dependent on
a  verb.  As  a  result  of  being  seen  in  several  contexts,  each  word  develops 
a  set  of  such  frames.  The  frames  are  primarily  useful  in  selecting  a  single 
word-class  as  a  function  of  context.  Thus,  for  a  word  that  can  be  a  noun, 
a  verb,  or  an  adjective,  the  context  in  which  it  is  found  is  often  decisive 
in  selecting  only  one  of  the  possibilities. 
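
The derivation of 4-tuple frames from the three input strings can be sketched
in a few lines of Python. This is an illustration under stated assumptions,
not the LISP 1.5 code of PLP-II: the function name derive_frames is
hypothetical, the sentence-final period is left out, and the "#" symbol used
for the context at a sentence edge is invented for the example.

```python
def derive_frames(words, classes, governors, boundary="#"):
    """Collect (preceding class, own class, following class, governor class)
    frames for every word of one already parsed sentence."""
    frames = {}
    for i, word in enumerate(words):
        prev_cls = classes[i - 1] if i > 0 else boundary
        next_cls = classes[i + 1] if i + 1 < len(words) else boundary
        frames.setdefault(word, set()).add(
            (prev_cls, classes[i], next_cls, governors[i])
        )
    return frames


# Sentence #1 from the text, without its final period.
words = "THE OLD MAN SAT ON THE BEACH".split()
classes = "ART ADJ N V PREP ART N".split()
governors = "N N V * *V N *PREP".split()

frames = derive_frames(words, classes, governors)
print(frames["MAN"])  # {('ADJ', 'N', 'V', 'V')}
```

Because "THE" occurs twice in different contexts, it accumulates two frames
from this one sentence, which is how a word's frame set grows as more parsed
sentences are supplied.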

In parsing mode, the system is given just an English sentence as an input.
For example, the following sentence was input to the system:

THE BOOK THAT YOU READ IS ON THE TABLE IN THE HALL .

PLP-II looks up each word in its dictionary and obtains for each the set of
4-tuple frames that it has so far accumulated. Generally it finds a set of
3-10 such frames for each word. The set of frames for all the words in the
sentence may be conceived of as a matrix of possible strings of word-classes
to characterize the sentence. Using the information provided by preceding and
following word-classes, the system is able to discard most such strings as
inconsistent with the present context. It is also able to use context cues
to calculate word-classes of words that were not in the dictionary. It does
this by predicting, from the word-class contexts of the preceding and following
words, what classes the word in question can be.
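
One way to picture how context discards inconsistent word-class strings is
the brute-force Python sketch below. It is an assumption-laden illustration
rather than the procedure PLP-II uses: it simply tries every combination of
one frame per word and keeps those whose preceding and following classes
agree with the neighbouring choices. The name consistent_strings, the "#"
edge symbol, and the toy frames are all hypothetical.

```python
from itertools import product


def consistent_strings(frame_sets, boundary="#"):
    """Return each consistent analysis as a list of
    (word-class, governor-class) pairs, one pair per word."""
    results = []
    for choice in product(*frame_sets):
        ok = True
        for i, (prev_cls, _cls, next_cls, _gov) in enumerate(choice):
            left = choice[i - 1][1] if i > 0 else boundary
            right = choice[i + 1][1] if i + 1 < len(choice) else boundary
            if prev_cls != left or next_cls != right:
                ok = False
                break
        if ok:
            results.append([(frame[1], frame[3]) for frame in choice])
    return results


# Invented frames for the three words THE MAN SAT; MAN is ambiguous
# between a noun and a verb.
frame_sets = [
    [("#", "ART", "N", "N"), ("#", "ART", "ADJ", "N")],   # THE
    [("ART", "N", "V", "V"), ("ART", "V", "N", "*")],     # MAN
    [("N", "V", "#", "*")],                               # SAT
]
print(consistent_strings(frame_sets))
```

Of the four combinations in this toy case only one survives, with MAN taken
as a noun. Exhaustive enumeration grows quickly with sentence length, so a
practical system would prune word by word instead.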

The result of this phase of the system's operation is to develop a set
of strings of the form shown in example #2, below. Being told what word-class
a word is and what word-class governs it, the parsing phase has the task of
determining actual dependency relationships between pairs of English words.
An essential feature of the parser's logic is the use of a pushdown list, which
we will refer to as PDL.

The following example will help to explain the operation of the parsing
phase.


#2  THE  BOOK  THAT   YOU   READ  IS  ON    THE  TABLE  .    word string
    ART  N     RPRON  PRON  V     V   PREP  ART  N      .    word-class string
    N    V     V      V     *N    *   *V    N    *PREP  .    dependency requirement string


"THE" is an article looking for a noun to govern it. The code, ART, is
put on the pushdown list and the next word is examined to discover if it meets
the dependency requirement associated with that article. The next word is a
noun and does meet the requirement, so "THE" is made dependent on "BOOK". The
next term in the word-class string is N, which is put on the pushdown list.
It requires a V for its governor. Since the immediately following term is
not a V but a RPRON (the symbol for relative pronoun), RPRON is placed on the
PDL. RPRON is seeking a V as its governor but the next word-class is PRON
(pronoun). PRON in its turn is put on the PDL. The contents of the PDL are
now the following set of 2-tuples: PRON-V, RPRON-V, N-V. (The first term is
the word-class; the second is the class of its governor.) The next word-class
in succession is V, which satisfies the PRON on top of the PDL. The dependency
pair YOU-READ is constructed (where the second term is always the governor) and
the PDL is popped, bringing RPRON to the top. The RPRON is also seeking a V,
which is still next in the sequence of word-classes, and so the pair THAT-READ
is constructed. The word-class N for BOOK is now on top of the PDL and is
looking forward for the V, which is next in sequence. However, the V is looking
backward for a noun, as symbolized by its dependency requirement *N, and this
requirement takes precedence. Thus the pair READ-BOOK is constructed. Since
BOOK is still seeking a V to govern it, its symbol, N, is not popped from
the PDL. With a governor found for READ, the next word-class is the V for
IS. This satisfies the requirement for the N at the top of the PDL, the pair
BOOK-IS is constructed, and V, the word-class of IS, is put on the PDL. This
V is the head of the sentence as shown by its dependency requirement, "*", so
the pair IS-* is constructed and the V is removed from the PDL. The next
word-class in sequence is a PREP looking for a *V, which it immediately finds,
and the pair ON-IS is constructed. In a similar fashion, the pairs THE-TABLE
and TABLE-ON complete the example. (A more detailed description of the
operation of the system is available in Burger, et al., 1966.)
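
The walk-through above can be approximated by a short Python sketch built
around a pushdown list. It is a reconstruction from the prose, not the PLP-II
program: the function name parse is hypothetical, and a backward requirement
such as *N is resolved by scanning back to the nearest earlier word of that
class, which is an assumption the paper does not spell out.

```python
def parse(words, classes, reqs):
    """Return (dependent, governor) pairs; the second term is the governor."""
    pdl = []    # indices of words still waiting for a governor to their right
    pairs = []
    for i, (cls, req) in enumerate(zip(classes, reqs)):
        # Backward requirement "*X": assumed to pick the nearest earlier
        # word whose class is X.
        gov = None
        if req.startswith("*") and req != "*":
            want = req[1:]
            gov = next(
                (j for j in range(i - 1, -1, -1) if classes[j] == want), None
            )
        # Words on the PDL that were waiting for this class are satisfied,
        # except the word that governs the current one, which stays put.
        while pdl and pdl[-1] != gov and reqs[pdl[-1]] == cls:
            pairs.append((words[pdl.pop()], words[i]))
        if req == "*":
            pairs.append((words[i], "*"))          # head of the sentence
        elif gov is not None:
            pairs.append((words[i], words[gov]))   # backward dependency
        elif not req.startswith("*"):
            pdl.append(i)                          # wait for a later governor
    return pairs


# Example #2 from the text, without its final period.
words = "THE BOOK THAT YOU READ IS ON THE TABLE".split()
classes = "ART N RPRON PRON V V PREP ART N".split()
reqs = "N V V V *N * *V N *PREP".split()

for dependent, governor in parse(words, classes, reqs):
    print(f"{dependent}-{governor}")
```

On example #2 the sketch yields the same nine pairs, in the same order, as
the prose description. The head word is paired with "*" directly rather than
being pushed and then removed, a simplification that does not change the
result.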

Figure 7 is a photograph of a display of the system's output for the
similar sentence, THE BOOK THAT YOU READ IS ON THE TABLE IN THE HALL. Each
element of the display is a 5-tuple in which the first term is the sequence
number of the word in the sentence, the second is the word itself, the third
is its word-class, the fourth is the word that governs it, and the fifth is
the sequence number of the governing word. The user may examine this display
to decide if the dependency analysis is correct. If not, he has three choices.
Assuming this is one of several analyses the system produced for the sentence,
he may display the next analysis. If he wishes, he may correct the analysis
by typing "FIX" followed by a set of 3-tuples indicating the sequence number
of the word to be corrected, its word-class, and the sequence number of its
governor. As a third alternative, he may reject the parsing entirely and go
back to the input mode to give appropriate grammatical and lexical information
directly.
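
The 5-tuple display and the FIX correction lend themselves to a small data
structure sketch in Python. Everything concrete here is assumed for
illustration: the function name apply_fix, the use of sequence number 0 and
the symbol "*" for the head of the sentence, and the sample analysis. The
paper gives only the content of the 3-tuples, not the typed format of the
command.

```python
def apply_fix(rows, fixes):
    """Apply FIX 3-tuples (word number, word-class, governor number) to a
    list of display 5-tuples (number, word, class, governor word, governor
    number) and return the corrected list."""
    by_seq = {row[0]: row for row in rows}
    for seq, cls, gov_seq in fixes:
        word = by_seq[seq][1]
        # Assumption: a governor number with no matching word marks the head.
        gov_word = by_seq[gov_seq][1] if gov_seq in by_seq else "*"
        by_seq[seq] = (seq, word, cls, gov_word, gov_seq)
    return [by_seq[row[0]] for row in rows]


# Invented analysis of THE BOOK IS IN THE HALL in which IN was attached
# to BOOK instead of IS.
rows = [
    (1, "THE", "ART", "BOOK", 2),
    (2, "BOOK", "N", "IS", 3),
    (3, "IS", "V", "*", 0),
    (4, "IN", "PREP", "BOOK", 2),
    (5, "THE", "ART", "HALL", 6),
    (6, "HALL", "N", "IN", 4),
]
for row in apply_fix(rows, [(4, "PREP", 3)]):
    print(row)
```

Keeping the governor's sequence number alongside its word is what lets a
correction name the intended governor unambiguously when a word such as THE
occurs more than once.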

Instead of examining this display, the user may prefer to call up an
immediate constituent tree. The tree, corresponding to the analysis of
Figure 7, is shown as Figure 8. Such a phrase-structure tree is automatically
constructed from the dependency analysis information with the aid of a brief
phrase-structure grammar whose rules are of the usual form "NP -> ART + N",
"S -> NP + VP", etc. As in other phases of the system, additions, deletions, or
modifications of the phrase-structure rules may be made on-line from the teletype
as required.

Having obtained a phrase-structure analysis of a sentence, the system
now translates it into a form that is useful in question-answering systems.
The form is that of a 3-tuple kernel where the first and third terms are
nominals and the second is a relation. The kernels for example sentence #2
are shown as a photograph of the display scope in Figure 9. In addition to
the display, hard copy is available from the teletype terminal or from an
off-line printing of a magnetic tape. Details of computing the kernels are
described in Burger, et al. (1966) and our approach to using them in answering
questions from text is discussed in Simmons, et al. (1966).


Figure 7. A Dependency Analysis


Figure 8. A Phrase Structure Analysis


Figure 9. Kernel Analysis


IV. CONCLUSIONS

In  terms  of  our  personal  experiences  with  document  retrieval  and 
sentence  analysis  systems,  the  conclusion  is  inescapable  that  on-line 
interaction  with  a  computer  system  augments  the  capabilities  of  both 
the  man  and  the  machine.  The  operations  that  are  well  enough  understood 
to be completely automatic include basic mathematical calculations, sorting
and searching of large files, maintaining consistency of dictionaries
and  grammars,  and,  in  general,  simple  data  processing  manipulations  on 
large  files.  Such  operations  as  optimizing  a  choice  of  descriptor  terms 
used  in  a  query,  selecting  an  intuitively  best  parsing  from  several 
choices,  or  evaluating  the  response  to  a  query  are  all  far  more  complex 
decisions  that  benefit  from  a  computer's  assistance  in  data  processing 
but  depend  in  the  last  analysis  on  a  human  judgment. 

Immediate  responses  from  a  computer  through  on-line  typewriters 
or  teletypes  and  CRT  display  scopes  make  the  result  of  the  computer's 
data  processing  operations  conveniently  available  to  a  human  user  in  a 
form  such  that  his  decision  is  simplified  and  can  be  made  with  increased 
rapidity. As a result, such tasks as finding a relevant subset of documents
from a large collection become manageable. In syntactic analysis
the computer-aided system, in comparison to wholly automatic parsers,
greatly reduces the labor of human analysis and offers the
advantage of human review and correction of the computer parsing.

Although these conclusions seem evident to us from our own experience,
a wider range of users for both the retrieval and sentence analysis
systems is desired. On-line interaction via typewriters and display
scopes in these areas is still such a new venture that considerable
advances in discovering optimal configurations of terminal equipment and
optimal human engineering of the interaction capabilities of the program
systems can be expected as a result of wider experience.